Plots of all sorts follow. Feel free to cannibalize anything you like.
library(tidyverse)
library(gplots)
library(corrplot) #correlation matrices and correlograms
library(tibble) #clean little tibbles
library(RColorBrewer) #creates custom color palettes
library(splitstackshape) #generates ID variables, crazy useful little tool.
library(ggdendro) #dendrograms
library(TTR) #smoothing and simple moving averages for time series
library(psych) #describeBy and pairs.panels
library(igraph) #used here for converting hclust to igraph object (vector(s) of edges)
library(ggraph) #plots cluster dendrogram as graph/network
library(dendextend) #the mighty dendrogram
library(ggrepel) #repels overlapping points and labels a wee bit
library(DescTools) #reorders factors
library(stats)
library(stringr)
library(viridis)
library(janitor) #clean variable names
library(ggthemes)
Sometimes reference lines are very useful for examining thresholds of impairment, shapes of trends, and other important aspects of the data. Let’s create some fake data highlighting where a critical observation (Dr. No) falls within an X-Y scatterplot of height and weight.
set.seed(989)
a <- letters[seq(from = 1, to = 26)]
b <- sample(100, 26)
c <- sample(100, 26)
d <- data.frame(a, b, c)
newnames <- c("obs", "height", "weight")
names(d) <- newnames
d$obs <- as.factor(d$obs)
e <- d %>% mutate(group = ifelse(obs == "n", 1, 0)) #if observation 'n' then group 1, all else group 0
myvec <- c("gray85", "darkred")
e$group <- as.factor(e$group)
ggplot(e, aes(height, weight, color = group)) + geom_point(size = 3, alpha = 0.6) +
scale_color_manual(values = myvec) + xlab("height") + ylab("weight") + theme_classic() +
ggtitle("Isolating Dr. No")
p <- ggplot(e, aes(height, weight, color = group)) + geom_point(size = 3, alpha = 0.6) +
scale_color_manual(values = myvec) + xlab("height") + ylab("weight") + theme_classic() +
ggtitle("Label Dr. No") + geom_label_repel(aes(label = obs))
print(p) #labels and repels points
Figure out Dr. No’s X and Y coordinates. Then zero in on him in two possible ways. The geom_segment method is a bit of a pita, but it allows precise control of endpoints. I’ll be honest though. I usually just port plots over to Illustrator for final touches on annotation.
e %>% filter(obs == "n") %>% print() #prints just the row corresponding to observation n
## obs height weight group
## 1 n 13 65 1
p + geom_hline(yintercept = 65, linetype = "dashed", color = "darkred") + geom_vline(xintercept = 13,
linetype = "dashed", color = "darkred") + ggtitle("Method 1: Hline & Vline") +
annotate("text", x = 30, y = 69, label = "Here's Dr. No", color = "darkred",
size = 6) #annotate using vline and hline
p + geom_segment(x = 13, y = 0, xend = 13, yend = 65, linetype = "dashed", color = "blue") +
geom_segment(x = 0, y = 65, xend = 13, yend = 65, linetype = "dashed", color = "blue") +
ggtitle("Method 2: Geom_Segment") #each line segment is a separate geom
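A third option worth knowing about is annotate() with the "segment" geom, which draws one-off shapes without inheriting aes() from the data. A minimal sketch with hypothetical stand-in data (Dr. No's coordinates, (13, 65), are the only values carried over from above):

```r
library(ggplot2)
# hypothetical stand-in data; Dr. No sits at (13, 65) as in the filter above
df <- data.frame(height = c(13, 50, 80), weight = c(65, 40, 90))
plt <- ggplot(df, aes(height, weight)) + geom_point(size = 3) +
  annotate("segment", x = 13, xend = 13, y = 0, yend = 65,
           linetype = "dashed", colour = "blue") +
  annotate("segment", x = 0, xend = 13, y = 65, yend = 65,
           linetype = "dashed", colour = "blue") +
  ggtitle("Method 3: annotate")
print(plt)
```

Because annotate() is not mapped to the data, it also plays nicely with legends: the segments never show up as legend keys.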
This scatterplot is for a project on physiological responses to profanity using pupillometry. I used the repel function from ggrepel to nudge the labels so that there is less overlap.
profane <- read.csv("profane.csv", header = T)
ggplot(profane, aes(Social_Accept, Arousal, color = Condition)) + geom_point(size = 3) +
ylab("Physiological Arousal") + xlab("Social Acceptability") + scale_color_manual(values = c("black",
"green")) + geom_text_repel(aes(label = Word), color = "black", size = 3) + theme(legend.position = c(0.85,
0.85)) + theme(legend.background = element_rect(fill = "white", color = "black")) +
theme(axis.line = element_line(colour = "black"), panel.border = element_blank(),
panel.background = element_blank()) + theme(panel.grid.minor = element_line(colour = "gray",
size = 0.1))
Adjustments (limits, breaks, etc.). ggplot2 can be a bit challenging regarding axes. Let’s first generate a dataframe populated with data randomly sampled from 0-100 without replacement. Then plot it.
set.seed(999) #fixed random sampling
dat <- data.frame(replicate(2, sample(0:100, 100, rep = F))) #selection without replacement
baseplot <- ggplot(dat, aes(X1, X2)) + geom_point(shape = 21, fill = "blue", size = 2.3,
alpha = 0.6) + jamie.theme + ylab(NULL) + xlab(NULL)
print(baseplot)
Yuck. We need finer-grained notation on both axes. Add breaks every 10.
newplot <- baseplot + scale_x_continuous(breaks = seq(0, 100, 10), limits = c(0,
100)) + scale_y_continuous(breaks = seq(0, 100, by = 10))
print(newplot) #scale continuous is seq(from,to,by)
‘xlim’ & ‘ylim’ can cut off data. Tread lightly. Here is what xlim = c(0, 50) looks like.
smaller <- baseplot + xlim(c(0, 50))
print(smaller)
coord_cartesian zooms in on a specific range without cutting/eliminating data.
focused <- baseplot + coord_cartesian(xlim = c(0, 25), ylim = c(0, 50))
print(focused)
Boxplots. It’s easy to forget that the crossbar reflects the median, not the mean. The edges of the box are Q1 and Q3, with the crossbar at Q2 (the median). The whiskers represent the min and max. Let’s construct a boxplot from scratch from a bogus sample composed of 5 observations (e.g., kids’ shoe sizes): 3, 4, 5, 6, 7. Stat 101: the box spans the interquartile range (Q1 to Q3), the crossbar is the median (here it’s 5), and the whiskers represent the max and min (that are not outliers). The trick to getting the whiskers here is to generate the error bar and then plop the boxplot over it. The width on the box and the whiskers should match.
seq7 <- data.frame(shoe.size = seq(3, 7, 1)) #sequence from 3-7, by 1's
ggplot(seq7, aes(x = "", y = shoe.size)) + stat_boxplot(geom = "errorbar", width = 0.3) +
xlab(NULL) + scale_y_continuous(breaks = seq(0, 10, 2), limits = c(0, 10)) +
geom_boxplot(fill = "blue", width = 0.3) + jamie.theme
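To sanity-check what the box is actually drawing, you can compute the five-number summary yourself with base R before plotting:

```r
# verify the boxplot's anatomy with base R summaries
shoe.size <- 3:7
fivenum(shoe.size)   # min, lower hinge, median, upper hinge, max -> 3 4 5 6 7
median(shoe.size)    # the crossbar -> 5
IQR(shoe.size)       # the box height (Q3 - Q1) -> 2
```

If ggplot’s box disagrees with fivenum(), something upstream (grouping, NAs) is off.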
Here’s a violin plot reflecting salience of 40k English words by vision, olfaction, taste, touch, audition from the Lancaster Sensorimotor Norms. I added a crossbar representing the mean of each distribution.
sense <- read.csv("LancasterNormsShort.csv", header = T)
sense.long <- sense %>% pivot_longer(2:6, names_to = "Sensory", values_to = "Likert")
sense.long$Sensory <- factor(sense.long$Sensory, levels = c("Visual", "Auditory",
"Haptic", "Olfactory", "Gustatory"))
ggplot(sense.long, aes(x = Sensory, y = Likert, fill = Sensory)) +
scale_y_continuous(breaks = seq(0, 5, 1), limits = c(0, 5)) + geom_violin() +
jamie.theme + ylab("Mean Likert Scale Value") + xlab("Sensory Modality") + stat_summary(fun = mean,
fun.min = mean, fun.max = mean, geom = "crossbar", color = "black", size = 0.4,
width = 0.3)
Barplots aren’t great at representing variance within a distribution; dotplots or violin plots are usually better alternatives. Nevertheless, it is sometimes helpful to include a barplot. Here’s one plotting pupil dilation differences as a function of reading curse words relative to neutral words. This takes a bit of finagling using pivot_longer to get the data into long form. When you then create the barplot, make sure you are plotting means (if that’s what you want to do) rather than “identity”. It is also crucial to dodge the bars unless you want a stacked bar chart. Finally, you need stat_summary if you want to calculate and add error bars on the fly. This can be tricky in terms of dodging and specifying the width of the error bars. I like the little slashes on the legend. You get those by passing show.legend = TRUE (formerly show_guide = TRUE) to the geom_bar code.
dat.z <- read.csv("bpeak.csv", header = T)
dat2 <- dat.z %>% pivot_longer(3:4, names_to = "WordCond", values_to = "Dilation")
colvec <- c("gray", "green")
ggplot(dat2, aes(x = StimCond, y = Dilation, fill = WordCond)) + geom_bar(position = "dodge",
stat = "summary", fun = "mean") + scale_fill_manual(values = colvec) + stat_summary(fun.data = mean_se,
geom = "errorbar", width = 0.2, position = position_dodge(0.9)) + jamie.theme
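If the on-the-fly stat_summary() dodging fights you, the other route is to precompute the means and standard errors with dplyr and hand them to geom_col/geom_errorbar with matching dodge widths. A sketch on fake stand-in data (the column names below mirror the real ones but the values are invented):

```r
library(dplyr)
library(ggplot2)
# fake long-form data standing in for the dilation measures
set.seed(1)
dat2 <- data.frame(StimCond = rep(c("read", "hear"), each = 20),
                   WordCond = rep(c("curse", "neutral"), times = 20),
                   Dilation = rnorm(40, 0.1, 0.05))
# precompute mean and SE per cell, then plot the summary table directly
summ <- dat2 %>%
  group_by(StimCond, WordCond) %>%
  summarise(m = mean(Dilation), se = sd(Dilation) / sqrt(n()), .groups = "drop")
ggplot(summ, aes(StimCond, m, fill = WordCond)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = m - se, ymax = m + se),
                width = 0.2, position = position_dodge(0.9))
```

Because both layers use position_dodge(0.9), the error bars land dead center on their bars.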
Dr. Joshua Troche crowdsourced ratings from 300+ people on 16 semantic dimensions for 750 English words. The correlation matrices and plots to follow reflect the strength of the relationship(s) between factors like sound salience, visual form, loudness. For example, sound and visual form tend to be highly correlated when considering words like ‘dog’. Josh published this work in Frontiers in Human Neuroscience. Coerce the first column to rownames or else you will get a big fat error when R tries to correlate a string with an integer.
MDS_Corr <- read.csv("JoshCorr.csv", header = T, row.names = 1)
joshcor <- cor(MDS_Corr, use = "complete.obs") #compute correlation matrix using the full dataset
corrplot(joshcor, method = "number", diag = F, type = "upper", bg = "palegreen4",
cl.pos = "n", number.cex = 0.9, col = "black", outline = T, number.digits = 2,
addgrid.col = "black")
Color scaled using the viridis palette
corrplot(joshcor, method = "shade", col = viridis(12), shade.col = NA, tl.col = "black",
tl.srt = 45)
Circles are scaled to indicate the strength of the correlation. Colors indicate the direction of the correlation (blue is negative, red is positive). Since this is a symmetrical matrix, I only included the upper diagonal.
josh <- read.csv("Exp1_Corrplot.csv", header = T)
corrmat <- cor(josh, use = "complete.obs") #create a correlation matrix
colnew <- colorRampPalette(c("blue", "white", "red"))(10) #creates a color palette ranging from blue to red in increments of 10
corrplot(corrmat, method = "circle", type = "upper", tl.srt = 45, tl.cex = 1, col = colnew,
tl.col = "black", diag = F, order = "hclust")
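A note on colorRampPalette(), since it trips people up: it does not return colors, it returns a function. Calling that function with an integer yields that many interpolated hex colors:

```r
# colorRampPalette() returns a *function*; call it with n to get n hex colors
colnew <- colorRampPalette(c("blue", "white", "red"))
colnew(3)            # the two endpoints plus the white midpoint
length(colnew(10))   # any resolution you like -> 10
```

That’s why the `(10)` sits at the end of the palette-creation line above: it immediately calls the freshly built function.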
The first column in the original data contains words. We need to convert these to rownames when we work with correlation matrices and corrplots. This yields a scatterplot matrix.
all <- read.csv("db_glascow.csv", header = T, row.names = 1) %>% clean_names() #need this row.names command
smaller <- all %>% dplyr::select(3, 6:8) #select valence, imageability, age of acquisition, familiarity
# NB: select() is also defined in other packages (e.g., MASS), so dplyr::select() is safest
pairs.panels(smaller, method = "pearson", hist.col = "#00AFBB", density = TRUE)
Nobody loves dendrograms more than my recently graduated MA thesis student, Helen Felker, and here’s one of her favorites. This ‘base’ unadorned cluster dendrogram represents semantic relations between words (N=75) rated in Spanish by a Spanish-English bilingual speaker on color, sound, emotion, size, and time salience. Here’s the setup. First convert the original data to a distance matrix (Euclidean), then hclust it using Ward’s Method. Then ‘as.dendrogram’ it. Then plot it. Here I struggled a bit with getting the branches of the dendrogram relatively homogeneous in length. The par(mar) function adjusts plot margins using four parameters: bottom, left, top, and right. The default is c(5.1, 4.1, 4.1, 2.1).
raw <- read.csv("Biling_Master_v10.csv", header = T)
raw_2 <- raw %>% select(8:9, 22:27) #isolate only the columns of interest- dimensions, condition, participant
sp <- raw_2 %>% filter(Tag == "sp") %>% select(2:8) #filters rows only corresponding to one condition 'sp' - spanish bilinguals
sp_means <- sp %>% group_by(Stim) %>% summarise_all(~ mean(.x, na.rm = TRUE)) %>%
as.data.frame()
sp2 <- sp_means[, -1] #create a new dataframe minus the first column (eliminate what will be rownames)
rownames(sp2) <- sp_means[, 1] #coerce the first column (Word) into rownames of the dataframe
sp.dist <- dist(sp2) #Euclidean distance matrix; the clustering itself happens in the next step
sp.clust <- hclust(sp.dist, method = "ward.D")
sp.dend <- as.dendrogram(sp.clust) #plt.dend <- ggdendrogram(sp.dend, rotate=TRUE)
plt.v1 <- sp.dend %>% set("labels_cex", 0.8) %>% set("branches_lwd", 1.5) %>% set("leaves_cex",
1.5) %>% set("labels_col", "blue") %>% raise.dendrogram(2)
plot(plt.v1, horiz = T)
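Since par(mar) came up: for horizontal dendrograms with long labels, widening the right margin (the fourth value) is usually what keeps labels from clipping. A minimal sketch using a built-in dataset (USArrests) rather than the bilingual data:

```r
# par() margins are c(bottom, left, top, right); default is c(5.1, 4.1, 4.1, 2.1)
hc <- hclust(dist(USArrests[1:10, ]), method = "ward.D")
dend <- as.dendrogram(hc)
old <- par(mar = c(5.1, 4.1, 4.1, 7.1))  # widen the right margin for the labels
plot(dend, horiz = TRUE)
par(old)  # restore the previous margins
```

par() returns the old settings when you change them, so stashing and restoring them keeps later plots unaffected.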
Same dendrogram plotted as a triangle. There’s lots of extra ‘junk’ space at the top. If we focus on the lower leaves, we might be able to discern a bit more structure.
plt.tri <- sp.dend %>% set("labels_cex", 0.8) %>% set("branches_lwd", 1.5) %>% set("leaves_cex",
1.5) %>% set("labels_col", "blue") %>% raise.dendrogram(2)
plot(plt.tri, horiz = F, type = "triangle")
Zoom in on the previous dendrogram. Cut at height 20. Show small leaf and branch nodes.
plt.tri2 <- sp.dend %>% set("labels_cex", 0.8) %>% set("branches_lwd", 1.25) %>%
set("leaves_cex", 1.25) %>% set("labels_col", "blue") %>% set("nodes_pch", 19) %>%
set("nodes_cex", 0.5)
# plot(plt.tri2, horiz=F, type='triangle')
plot(cut(plt.tri2, 20)$lower[[2]], horiz = F, type = "triangle")
Oh the beauty of a nice heatmap. Here’s one using our semantic data where I’ve played with the color spectrum, varying it from yellow (low) to red (high) in 299 hue increments. These data represent average Likert-scale ratings on multiple semantic dimensions (color, sound, etc.) for three English words (kite, alligator, reputation). Prep the dataframe by changing it to a matrix, coercing column one to rownames.
dat.t <- read.csv("Tiles2.csv", header = T)
dat.d <- as.data.frame(dat.t)
dat.1 <- dat.d[, -1] #create a new dataframe minus the first column (eliminate what will be rownames)
rownames(dat.1) <- dat.d[, 1]
mat.1 <- as.matrix(dat.1)
jamie.colors <- colorRampPalette(c("yellow", "red"))(n = 299)
heatmap.2(mat.1, key.title = NA, dendrogram = "none", trace = "none", col = jamie.colors,
density.info = "none", margins = c(8, 8), cexRow = 1.5)
Sort of like an interpolated heat map.
all <- read.csv("db_glascow.csv", header = T, row.names = 1) %>% clean_names()
all$gls_img <- as.integer(all$gls_img)
all$gls_arousal <- as.integer(all$gls_arousal)
all$gls_aoa <- as.integer(all$gls_aoa)
ggplot(all, aes(gls_img, gls_arousal, z = gls_aoa)) + geom_contour_filled(binwidth = 0.2) +
xlim(2, 6) + labs(fill = "AoA") + jamie.theme
Here’s something you don’t see every day. This histogram reflects counts from Likert-Scale ratings of the quality of pairing common English nouns (e.g., rocket) with taboo words (e.g., shit) to form new taboo compounds (e.g., shitrocket). In this particular histogram, I changed the binwidth to .10 and the y-axis to raw counts. Histogram 2 plots an overlay of a normal curve from a set of psycholinguistic norms.
profexp2 <- read.csv("ProfanityExp2_CleanR.csv", header = T)
Lancaster <- read.csv("LancasterShort.csv", header = T)
ggplot(profexp2, aes(x = Qualtrics)) + geom_histogram(colour = "black", fill = "goldenrod2",
binwidth = 0.1) + xlab("Average Likert Scale Rating") + ylab("Word Count Per Bin") +
jamie.theme + ggtitle("profane noun compounds")
ggplot(Lancaster, aes(x = Olfactory)) + geom_histogram(aes(y = after_stat(density)), colour = "black",
fill = "green", binwidth = 0.1) + xlab("Likert Rating") + ylab("Density") + jamie.theme +
ggtitle("dnorm gaussian overlay") + stat_function(fun = dnorm, args = list(mean = mean(Lancaster$Olfactory),
sd = sd(Lancaster$Olfactory)), color = "black", size = 1)
A density plot is like a smoothed histogram. They’re quite pretty. Let’s say you want to examine two distributions, each composed of 1000 observations. It’s Philly kids vs NYC kids on some crazy standardized test. Let’s say the Philly kids score a mean of 80 (sd=2), whereas NYC kids score a mean of 87 (sd=5). Here’s our fake data setup.
set.seed(123)
philly <- rnorm(1000, 80, 2)
set.seed(1234)
nyc <- rnorm(1000, 87, 5)
all <- data.frame(philly, nyc)
phlnyc <- all %>% pivot_longer(1:2, names_to = "city", values_to = "test.score") #to long form
col.vec <- c("goldenrod2", "blue")
ggplot(phlnyc, aes(x = test.score, fill = city)) + geom_histogram(aes(y = after_stat(count)),
colour = "black", binwidth = 0.5) + scale_fill_manual(values = col.vec) + xlab("Test.Score") +
ylab("Count") + facet_wrap(~city) + theme_bw()
Hmm… but we want to eyeball the two distributions in the same ‘space’. Enter the density plot. Here are two attempts, the second much better than the first thanks to an adjusted alpha.
ggplot(phlnyc, aes(x = test.score, fill = city)) + geom_density(colour = "black") +
scale_fill_manual(values = col.vec) + xlab("Test.Score") + ylab("Density") + jamie.theme +
ggtitle("Nasty")
ggplot(phlnyc, aes(x = test.score, fill = city)) + geom_density(alpha = 0.5, colour = "white") +
scale_fill_manual(values = col.vec) + xlab("Test.Score") + ylab("Density") + jamie.theme +
ggtitle("Pretty")
Just means across time. I eliminated the x-axis line and redrew it as dashed.
lvppa <- read.csv("ancdsDat.csv", header = T)
ggplot(lvppa, aes(x = Month, y = Accuracy, colour = Item)) + geom_line(size = 1) +
theme_bw() + theme(axis.line = element_line(colour = "black"), panel.border = element_blank(),
panel.background = element_blank(), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.line.x = element_blank()) +
coord_cartesian(ylim = c(0, 1.1)) + geom_hline(aes(yintercept = 0), linetype = "dashed") +
ylab("% Accuracy") + xlab("Time (months)") + geom_point(shape = 17, size = 3) +
labs(colour = "")
These curves reflect three response functions of baseline corrected change in pupil size (in mm) elicited during a word monitoring task when people stare at either an unchanging gray screen in darkness, mid-level luminance, or obnoxiously bright light. The data are in long form, factors are reordered (dark, mid, bright).
summary.words <- read.csv("Exp2_RSetupPostPipeline_1.0.csv", header = T)
summary.words$lightCond <- factor(summary.words$lightCond, levels = c("dark", "mid",
"bright"))
newvec <- c("black", "red", "yellow")
ggplot(summary.words, aes(x = bin, y = raw.diff, color = lightCond, fill = lightCond)) +
geom_smooth(method = "loess", alpha = 0.4) + scale_colour_manual(values = newvec) +
scale_fill_manual(values = newvec) + coord_cartesian(ylim = c(-0.05, 0.2)) +
ylab("Absolute Pupil Dilation (mm)") +
xlab("Event Duration (bin)") + scale_x_continuous(breaks = pretty(summary.words$bin,
n = 12)) + jamie.theme
raw <- read.csv("Biling_Master_v10.csv", header = T)
raw_cond <- raw %>% select(3, 13, 22:27)
cond_piv <- raw_cond %>% pivot_longer(3:8, names_to = "Cond", values_to = "value") #to long form
colslope <- c("green", "gray") #creates a vector of colors to pass to aes
ggplot(cond_piv, aes(x = CNC.av, y = value, fill = Condition, color = Condition)) +
geom_smooth(method = "loess") + scale_fill_manual(values = colslope) + scale_color_manual(values = colslope) +
facet_wrap(~Cond, ncol = 2) + xlab("concreteness") + ylab("likert rating") +
jamie.theme
Line graphs representing group level pupillary responses
pupcurse <- read.csv("exp1_rev.csv", header = T)
piv <- pupcurse %>% pivot_longer(6:45, names_to = "time_bin", values_to = "diameter")
piv$BinSeq <- piv$time_bin %>% str_replace_all("Bin", "") %>% as.numeric() #strip the chr prefix 'Bin', then coerce to numeric
ggplot(piv, aes(x = BinSeq, y = diameter, color = Condition)) + stat_summary(fun = mean,
na.rm = T, geom = "line", size = 0.5) + scale_colour_manual(values = c("green",
"black", "red")) + stat_summary(fun.data = mean_se, geom = "pointrange", size = 0.2) +
theme_bw() + ylab("Pupil Diameter Change (mm)") + xlab("Time Bin") + theme(legend.position = c(0.85,
0.85)) + jamie.theme + ggtitle("Line Graph w Error Bars")
These pupil response functions reflect pupil diameter to Yes/No responses when people think of darkness (dark cave) associated with “No” and brightness (sunny day) associated with “Yes”.
p <- read_csv("BrightDark.csv")
pupiv <- p %>% pivot_longer(5:52, names_to = "bin", values_to = "pupil")
pupiv$BinSeq <- pupiv$bin %>% str_replace_all("BIN", " ") %>% as.numeric()
pupiv$Time <- 125 * (pupiv$BinSeq - 1)
mycolors <- c("goldenrod2", "black")
pupit <- ggplot(pupiv, aes(Time, pupil, color = COND)) + stat_summary(fun.data = mean_se,
geom = "pointrange", size = 0.3) + scale_color_manual(values = mycolors)
print(pupit)
Here’s a spaghetti plot illustrating change from baseline to followup in a cohort of 30 adults. This has a horizontal line annotation marking threshold for impairment on the Montreal Cognitive Assessment.
spag <- read.csv("Spaghetti.csv", header = T)
ggplot(spag, aes(x = time, y = value, color = Participant)) + geom_point() + geom_line() +
scale_y_continuous(breaks = seq(10, 30, 2), limits = c(10, 30)) + geom_hline(yintercept = 25,
color = "black", size = 1, lty = 5, alpha = 0.6) + jamie.theme + facet_wrap(~GroupImpair)
Here’s a nice alternative to boxplots, reflecting individual differences in average uncorrected pupil diameters for a bunch of neurotypical adults measured in very bright, medium, and dark ambient light. Conditions are colored by a custom vector (‘newvec’ in the code block below). The dataframe is first pivoted to long form.
scat <- read.csv("Exp2_PlotUncorrected.csv", header = T)
s <- scat %>% pivot_longer(2:4, names_to = "light", values_to = "pupil")
newvec <- c("black", "red", "yellow")
ggplot(s, aes(light, pupil, shape = light, fill = light)) + geom_point(shape = 21,
color = "black", size = 2.3, alpha = 0.6, position = position_jitter(w = 0.03,
h = 0)) + scale_fill_manual(values = newvec) + ylab("Raw Pupil Size (mm)") +
ylim(c(2, 5)) + stat_summary(fun = mean, fun.min = mean, fun.max = mean,
geom = "crossbar", color = "black", size = 0.4, width = 0.3) + jamie.theme
Mostly linear relation with a wee bit of random jitter around each observation, fit with an LM.
test <- data.frame(ground = 1:150)
test$g2 <- test$ground^2 #store in the dataframe so ggplot can find them
test$g3 <- test$ground^3
set.seed(127)
test$linear <- jitter(test$ground * 5, 30)
ggplot(test, aes(ground, linear)) + geom_point(shape = 2, size = 2, alpha = 0.7) +
xlab("Base") + ylab("linear") + geom_smooth(method = "lm", color = "seagreen4") +
jamie.theme + ggtitle("linear relation")
ggplot(test, aes(ground, g3)) + geom_point(shape = 2, size = 2, color = "black",
alpha = 0.7) + xlab("Base") + ylab("cubic") + geom_smooth(method = "lm", color = "seagreen4") +
jamie.theme + ggtitle("cubic relation")
This scatterplot shows the correlation between concreteness and imageability across many nouns from the MRC Psycholinguistic Database. For some reason CNC gets read in as a factor and must be coerced to integer for this to work. The trendline here is fit with a GLM.
cnc <- read.csv("ImgBigSet.csv", header = T)
cnc$CNC <- as.integer(cnc$CNC)
ggplot(cnc, aes(x = CNC, y = IMG)) + geom_point(shape = 2, size = 1, color = "blue",
alpha = 0.7) + stat_smooth(method = glm, color = "red") + xlab("Concreteness") +
ylab("Imageability") + scale_x_continuous(breaks = seq(0, 600, by = 100)) + scale_y_continuous(breaks = seq(0,
600, by = 100)) + jamie.theme
My custom theme for ggplot2, jamie.theme, is a minimalist home brew. I apply it to most of the plots in this document. It strips panel gridlines and all sorts of other default junk. To try out alternative looks, we will work from the ggthemes package.
jamie.theme <- theme_bw() + theme(axis.line = element_line(colour = "black"), panel.grid.minor = element_blank(),
panel.grid.major = element_blank(), panel.border = element_blank(), panel.background = element_blank(),
legend.title = element_blank())
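Rather than pasting jamie.theme onto every plot, you can also set it once per session with theme_set(); a sketch (the theme definition is repeated here so the block stands alone):

```r
library(ggplot2)
jamie.theme <- theme_bw() +
  theme(axis.line = element_line(colour = "black"),
        panel.grid.minor = element_blank(),
        panel.grid.major = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank(),
        legend.title = element_blank())
old <- theme_set(jamie.theme)  # every subsequent plot picks it up automatically
# ... make plots here ...
theme_set(old)  # restore the prior default when done
```

theme_set() returns the previous theme, so you can stash and restore it just like par().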
pupit + theme_minimal() + ggtitle("minimal")
pupit + theme_classic() + ggtitle("classic")
pupit + theme_linedraw() + ggtitle("linedraw")
pupit + theme_void() + ggtitle("void")
pupit + theme_dark() + ggtitle("dark")
Let’s take apart some of the theme elements.
pupit + ggtitle("default theme is theme_gray")
pupit + theme(axis.line = element_line(colour = "black"), panel.background = element_blank()) +
ggtitle("no gridlines, no background")
pupit + theme_minimal(base_family = "Helvetica") + ggtitle("font change to Helvetica")
Let’s look at the same plot with lots of different theme options from the ggthemes package
pupit + theme_economist() + ggtitle("economist")
pupit + theme_wsj() + ggtitle("wall street journal")
pupit + theme_pander() + ggtitle("pander")
pupit + theme_fivethirtyeight() + ggtitle("538")
Here’s a time series representing continuous sampling of pupil diameter.
pupil <- read.csv("PupilRise.csv", header = T)
pupil.ts <- as.ts(pupil) #recodes the first column into a time series (3000 observations of pupil diameter fluctuations)
# Creates a time series plot of the continuously sampled pupil diameter
# measurements, scales the y-axis from 3 to 5 mm, colors the line red, relabels
# the axes
plot.ts(pupil.ts, ylim = c(3, 5), xlim = c(0, 3000), ylab = "pupil dm (mm)", xlab = "samples",
col = "red")
Apply a simple moving average to the original time series and then replot it. TTR’s simple moving average (SMA) function averages each new datapoint with the N data points before it to create a new smoothed time series. This takes care of “weird” artifacts like your eyetracker behaving crazily. Here’s what smoothing at a backward window of 10 items looks like – less jagged than the original.
ts.smooth <- ts(SMA(pupil$size, n = 10))
plot.ts(ts.smooth, ylim = c(3, 5), xlim = c(0, 3000), ylab = "pupil dm (mm)", xlab = "samples/time",
col = "blue", lwd = 3)
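Under the hood, an n-point SMA is just an unweighted trailing mean, so stats::filter() with equal one-sided weights reproduces what TTR::SMA does, modulo the NA padding while the window fills. A sketch on fake data:

```r
set.seed(42)
x <- cumsum(rnorm(100))                              # fake noisy series
sma10 <- stats::filter(x, rep(1/10, 10), sides = 1)  # trailing 10-point mean
head(sma10, 12)  # the first 9 values are NA: the window isn't full yet
```

sides = 1 is the important bit: it restricts the filter to the current point and the points before it, exactly the "backward window" described above.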
The sections below represent various plot types, aesthetics, themes, and helpful snippets.
Creating and applying custom color vectors
This plot also has some nice jitter, colors, and opacity on the points. These data represent different measures of pupil size. Some steps in structuring the data follow.
scat2 <- read.csv("Exp2_Scattersetup_2.0.csv", header = T)
colnames(scat2)
## [1] "participant" "high" "low" "mid" "condition"
sc <- scat2 %>% pivot_longer(2:4, names_to = "variable", values_to = "value") #to long form
sc$variable <- factor(sc$variable, levels = c("low", "mid", "high")) #reorder the factors
newvec <- c("black", "red", "yellow")
ggplot(sc, aes(variable, value, shape = variable, fill = variable)) + geom_point(shape = 21,
color = "black", size = 2.3, alpha = 0.6, position = position_jitter(w = 0.17,
h = 0)) + scale_fill_manual(values = newvec) + facet_wrap(~condition, scales = "free_y") +
ylab("Pupil Dilation (mm)") # NB: the panels have different y-axis scales, hence scales = "free_y"